Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment

Author

  • Yong-Bi Fu
Abstract

Genotyping by sequencing (GBS) has recently emerged as a promising genomic approach for assessing genetic diversity on a genome-wide scale. However, concerns are not lacking about the uniquely large unbalance in GBS genotype data. While some genotype imputation has been proposed to infer missing observations, little is known about the reliability of a genetic diversity analysis of GBS data with up to 90% of observations missing. Here we performed an empirical assessment of accuracy in genetic diversity analysis of highly incomplete SNP genotypes with imputations. Three large SNP genotype data sets for corn, wheat and rice were acquired, and missing data with up to 90% of missing observations were randomly generated and then imputed for missing genotypes with three map-independent imputation methods. Estimating heterozygosity and inbreeding coefficient from original, missing and imputed data revealed variable patterns of bias from assessed levels of missingness and genotype imputation, but the estimation biases were smaller for missing data without genotype imputation. The estimates of genetic differentiation were rather robust up to 90% of missing observations, but became substantially biased when missing genotypes were imputed. The estimates of topology accuracy for four representative samples of groups of interest were generally reduced with increased levels of missing genotypes. Probabilistic principal component analysis based imputation performed better in terms of topology accuracy than those analyses of missing data without genotype imputation. These findings are not only significant for understanding the reliability of the genetic diversity analysis with respect to large missing data and genotype imputation, but also are instructive for performing a proper genetic diversity analysis of highly incomplete GBS or other genotype data.
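The masking-and-estimation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's pipeline: it assumes biallelic SNPs coded as alternate-allele counts (0, 1, 2), uniform random masking of cells, and the textbook estimators Ho (share of heterozygous calls), He = 2p(1 - p) and F = 1 - Ho/He. The function names and the `MISSING` marker are hypothetical, and no imputation method is shown.

```python
import random

MISSING = None  # hypothetical marker for an unobserved genotype call


def mask_genotypes(matrix, rate, seed=0):
    """Return a copy of `matrix` with a fraction `rate` of cells set to MISSING."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(matrix)) for j in range(len(matrix[0]))]
    to_mask = rng.sample(cells, round(rate * len(cells)))
    masked = [list(row) for row in matrix]  # leave the input untouched
    for i, j in to_mask:
        masked[i][j] = MISSING
    return masked


def locus_stats(calls):
    """Return (Ho, He, F) for one SNP from its observed calls, or None if all are missing."""
    observed = [g for g in calls if g is not MISSING]
    if not observed:
        return None
    n = len(observed)
    p = sum(observed) / (2 * n)                  # alternate-allele frequency
    ho = sum(1 for g in observed if g == 1) / n  # observed heterozygosity
    he = 2 * p * (1 - p)                         # expected heterozygosity
    f = 1 - ho / he if he > 0 else None          # undefined at monomorphic loci
    return ho, he, f


def mean_heterozygosity(matrix):
    """Average Ho and He over SNPs (columns) that keep at least one observed call."""
    stats = [s for s in (locus_stats(col) for col in zip(*matrix)) if s is not None]
    if not stats:
        return None
    mean_ho = sum(s[0] for s in stats) / len(stats)
    mean_he = sum(s[1] for s in stats) / len(stats)
    return mean_ho, mean_he


# Toy data only: 50 samples x 200 SNPs drawn at random, not real GBS calls.
rng = random.Random(1)
full = [[rng.choice([0, 1, 2]) for _ in range(200)] for _ in range(50)]
for rate in (0.0, 0.5, 0.9):
    print(rate, mean_heterozygosity(mask_genotypes(full, rate)))
```

Comparing the estimates at each masking rate with those from the complete matrix gives a simple view of the bias that missingness alone introduces, before any imputation is applied.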


Similar resources

Genetic Diversity Analysis of Highly Incomplete SNP Genotype Data with Imputations: An Empirical Assessment

Genotyping by sequencing (GBS) recently has emerged as a promising genomic approach for assessing genetic diversity on a genome-wide scale. However, concerns are not lacking about the uniquely large unbalance in GBS genotype data. Although some genotype imputation has been proposed to infer missing observations, little is known about the reliability of a genetic diversity analysis of GBS data, ...


Assessment of the genetic diversity among potato cultivars from different geographical areas using the genomic and EST microsatellites

Background: Potato has a narrow genetic base because of its development: it derives from a few genotypes that originated in South America. Objectives: The objective of this study was to assess the genetic relationships among potato (Solanum tuberosum L.) genotypes originating from different geographical regions. Materials and Methods: This study has rendered 25 use...


A Review of Microsatellite Marker Usage in the Assessment of Genetic Diversity of Camelus

Camels have been regarded as the ship of the desert, and they play a multi-utility role in the world. Estimation of genetic parameters is the foremost step towards managing genetic resources for their conservation and sustainable utilization. Microsatellite markers have been extensively used in cattle, sheep, goats and camels. However, genetic characterization studies on camels have been poorly recorded. T...


Genotyping-By-Sequencing for Plant Genetic Diversity Analysis: A Lab Guide for SNP Genotyping

Genotyping-by-sequencing (GBS) has recently emerged as a promising genomic approach for exploring plant genetic diversity on a genome-wide scale. However, many uncertainties and challenges remain in the application of GBS, particularly in non-model species. Here, we present a GBS protocol we developed and use for plant genetic diversity analysis. It uses two restriction enzymes to reduce genome...


Evaluation of ten SNP Markers for Human Identification and Paternity Analysis in Persian Population

Background: DNA markers are indispensable tools for human identification in forensic science. Single Nucleotide Polymorphisms (SNPs) are one category of these markers, of particular interest for degraded DNA because of their short amplicons. Objectives: Detection of highly informative SNPs by the criteria is the essential step to devel...




Publication date: 2014